Abstract

The tertiary structures of the S1 and S2 domains of the spike protein of the coronavirus responsible for
severe acute respiratory syndrome (SARS) have recently been predicted. Here, a molecular assembly of the
SARS coronavirus peplomer that accounts for the available functional data is suggested. The interaction
between S1 and S2 appears to be stabilised by a large hydrophobic network of aromatic side chains present
in both domains. This feature appears to be common to all coronaviruses, suggesting a potential target
for drugs preventing coronavirus-related infections.

© 2004 Elsevier Inc. All rights reserved. 

We are now in a post-epidemic period of severe acute
respiratory syndrome (SARS), caused by the coronavirus
henceforth called SARS-CoV. Nevertheless, since the mode
of transmission, spread, and mechanisms of virulence of
SARS-CoV are not fully understood, all the possible
weapons that immunology and pharmacology can provide
should be prepared against the virus, so that we can
defend ourselves better when it again rears its
infecting crown [1].

For a pharmacological approach, the structural
characterisation of the molecular repertoire of the target
organism is of fundamental importance. In this respect,
little is yet available for SARS-CoV: only two
crystallographic determinations [2,3] and a few predictive
models [4-6] have been reported so far.

Among the above-mentioned structures, the predicted
structures of the S1 and S2 domains of the viral spike
glycoprotein [5] can represent a rational basis for the
design of specific antiviral drugs and diagnostic kits.
This protein, indeed, has been found to be the viral
membrane protein responsible for SARS-CoV cell entry: it
interacts with the receptor of the target cell and causes
the subsequent virus-cell fusion [7].

Ultra-high resolution images of SARS-CoV have been
obtained by scanning electron microscopy (SEM) [8]; they
indicate that the spike glycoprotein is organised as a
trimer. This finding offers a fundamental hint for
investigating the overall assembly of the outer viral
particles, the peplomers, which give the virion the
characteristic crown-like appearance for which it is
classified in the Coronaviridae family [9].

A stable quaternary structure without covalent
cross-linking has been proposed, in general, for coronavirus
peplomers [7]. This feature is consistent with our previous
structural predictions, as neither model of the SARS-CoV
spike glycoprotein domains contains a Cys residue lacking a
cystine-bridged counterpart.

The distribution of N-glycosylation and mutation 
sites has also been considered for a fine-tuning of the 
peplomer structural features with functional data. 


Materials and methods 

Peplomer model building was performed on the basis of the
structures of the S1 and S2 domains of the SARS-CoV spike protein
available in the Protein Data Bank [10]: the structure models
1Q4Z and 1U4K were used for S1 and S2, respectively. Docking
of the two domains was performed manually, and the reliability of
each of the possible peplomer assemblies was assessed with the
ProsaII software package [11]. Accordingly, the quaternary structures
exhibiting the lowest atom-pair and solvent interaction energies
were considered for further optimisation by molecular dynamics
simulations with Gromacs [12]. After a PROCHECK analysis, the final
refined peplomer structure was deposited in the Protein Data Bank
with the ID code 1T7G. All displays of structures, as well as
exposed surface area (ESA) calculations, were carried out
with the program MOLMOL [13].

Results and discussion 

The shape and dimensions of the SARS coronavirus peplomer
are now well defined by recent SEM determinations [8]: the
club-shaped protrusions of the trimeric glycoprotein appear
to extend approximately 200 Å from the virion envelope
membrane, with a maximum width of 100-200 Å.

It has been shown that coronaviruses present the S1
domain as the globular head of the spike, with
receptor-binding activity, and that the S2 domain forms
the stalk portion of the spike [14]. In this respect, the
fact that SEM images clearly suggest that the spike
glycoprotein is present as a trimer in the viral
peplomers [8] is a fundamental starting point for our
model building procedure. This is also in accord with the
general rule that coronavirus spike proteins form
three-stranded left-handed coiled-coils. Moreover, the fact
that the 320-518 fragment of the S1 domain has been
identified as the SARS-CoV peplomer binding site for the
ACE2 cellular receptor [15] implies that the residues most
involved in the interaction with the receptor have to be
positioned on the external top side of S1.

These first morphological and functional hints have
been coupled to the results of a systematic search for
surface hot spots of the SARS-CoV S1 and S2 domains,
i.e., potential drug binding and/or protein-protein
interaction sites, to gain structural information on the
relative orientations of these domains. This analysis was
performed on the basis of the available S1 and S2
molecular models [5]. Furthermore, a Clustal W [16]
analysis of all the coronavirus spike proteins present in
the SwissProt protein sequence data bank was carried out:
236 sequences were found for comparison with that of
SARS-CoV, SwissProt Accession No. P59594, originally used
for our model building of the S1 and S2 domains of the
S glycoprotein.
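
The outcome of such a multiple alignment is usually summarised as pairwise percent identity. The following is a minimal sketch of that calculation, not the procedure used by Clustal W itself. It assumes that identity is counted over columns in which neither sequence has a gap (the text does not state which convention was adopted), and the two aligned fragments are hypothetical toy strings, not real spike sequences.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # gapped columns are excluded (assumed convention)
        compared += 1
        if x == y:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Hypothetical toy alignment, for illustration only.
toy_a = "MFIF-LLFLT"
toy_b = "MFLFALL-LS"
print(percent_identity(toy_a, toy_b))
```

Other conventions (for example, dividing by the full alignment length or by the shorter sequence) give different values, so the denominator should always be reported together with the percentage.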

To build the molecular model of the SARS-CoV peplomer,
the modelled structures of its S1 and S2 domains were
used together with homology criteria derived from the
quaternary assembly of other viral systems [14,17,18].

In the first step of the model building procedure, the
three S2 domains were positioned with respect to one
another by assembling the long α-helix spanning residues
904-968, which constitutes the first heptad repeat (HR1,
according to the prediction by Multicoil [19]), into a
three-stranded, left-handed coiled-coil. The three HR1s
were first aligned parallel along the major axis, and then
each was rotated about its center of mass by the same
amount to bring the amino acids at positions a and d into
juxtaposition. The structure was refined by minimization
followed by simulated annealing dynamics.
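
The role of positions a and d can be made concrete with a small sketch that assigns heptad labels (a to g) along the HR1 helix and lists the residues expected to face the core of the coiled-coil. The residue range 904-968 comes from the text; the choice of residue 904 as position a is only an assumption made for illustration, since the actual register is the one predicted by Multicoil and is not given here.

```python
HEPTAD = "abcdefg"


def heptad_register(start, end, first_position="a"):
    """Map each residue number in [start, end] to a heptad label a-g."""
    offset = HEPTAD.index(first_position)
    return {res: HEPTAD[(res - start + offset) % 7] for res in range(start, end + 1)}


def core_positions(register):
    """Residues at positions a and d, which pack at the coiled-coil core."""
    return [res for res, pos in sorted(register.items()) if pos in ("a", "d")]


# Assumption: residue 904 is placed at position a (hypothetical register).
register = heptad_register(904, 968)
core = core_positions(register)
print(len(register), "residues;", len(core), "at positions a/d")
print(core[:4])
```

Changing `first_position` shifts the whole register, which is a quick way to see which side chains would become core-facing under an alternative prediction.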

In the second step, possible interfaces between the S1
and S2 domains, and among the S1 + S2 components,
needed for the assembly of a trimeric structure, were
systematically searched.

Thus, in spite of the limited sequence homology found
for all these spike glycoproteins, ranging from 20.39% to
27.63%, a high level of residue conservation is present in
the S1 hydrophobic pocket delimited by F187, F334, F253,
and W423. In this respect SARS-CoV, when compared with all
the other coronaviruses, is unique in its W/F swapping
between positions 253 and 423, see Table 1. From Table 1 it
can also be noticed that in the S2 domain the hydrophobic
residues L803 and F805, totally conserved among the
available SARS-CoV genomes and fully exposed in the S2
molecular model, are located in sequence positions where
only hydrophobic residues are found.

Table 1
Amino acid occupancy of a possible strong binding site between S1 and S2 domains

Coronavirus    Number of sequenced genomes    187    253    334    423    782      784
Human SARS     36                             F      F      F      W      F        F
Murine         16                             F      W      F,V    Y      L,V      F
Porcine (1)    1                              F      W      V      Y      F        F
Porcine (2)    16                             F      W      V      Y      F        F
Porcine (3)    30                             F      W      V      Y      F        F
Porcine (4)    7                              F      W      V      Y      F        F
Bovine         26                             Y      W      V      Y      F        F
Human          24                             F      W      V      Y      L,I      F
Equine         1                              Y      W      V      Y      F        F
Avian          112                            F      W      V      Y      L,I,V    F
Feline         54                             F      W      V      Y      F        F
Canine         8                              Y      W      V      Y      F        F

(1), Hemagglutinating encephalomyelitis virus; (2), transmissible gastroenteritis virus;
(3), respiratory virus; and (4), epidemic diarrhea virus.
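
The occupancy pattern of Table 1 can be checked with a short script. The sketch below transcribes the table (column labels as printed there) and tests whether every residue observed at each position falls in a simple aliphatic/aromatic class. That class, `HYDROPHOBIC`, is an assumption introduced here for illustration; the text does not define a hydrophobicity scale.

```python
# Assumption: a simple aliphatic/aromatic residue class, not defined in the text.
HYDROPHOBIC = set("AVLIMFWY")

POSITIONS = (187, 253, 334, 423, 782, 784)

# Rows transcribed from Table 1; several letters mean several residues observed.
TABLE_1 = {
    "Human SARS":  ("F", "F", "F",  "W", "F",   "F"),
    "Murine":      ("F", "W", "FV", "Y", "LV",  "F"),
    "Porcine (1)": ("F", "W", "V",  "Y", "F",   "F"),
    "Porcine (2)": ("F", "W", "V",  "Y", "F",   "F"),
    "Porcine (3)": ("F", "W", "V",  "Y", "F",   "F"),
    "Porcine (4)": ("F", "W", "V",  "Y", "F",   "F"),
    "Bovine":      ("Y", "W", "V",  "Y", "F",   "F"),
    "Human":       ("F", "W", "V",  "Y", "LI",  "F"),
    "Equine":      ("Y", "W", "V",  "Y", "F",   "F"),
    "Avian":       ("F", "W", "V",  "Y", "LIV", "F"),
    "Feline":      ("F", "W", "V",  "Y", "F",   "F"),
    "Canine":      ("Y", "W", "V",  "Y", "F",   "F"),
}


def residues_by_position(table):
    """Collect the set of residues observed at each tabulated position."""
    observed = {pos: set() for pos in POSITIONS}
    for row in table.values():
        for pos, residues in zip(POSITIONS, row):
            observed[pos].update(residues)
    return observed


observed = residues_by_position(TABLE_1)
for pos in POSITIONS:
    kind = "hydrophobic only" if observed[pos] <= HYDROPHOBIC else "mixed"
    print(pos, "".join(sorted(observed[pos])), kind)
```

Under this assumed residue class, every tabulated position is occupied only by aliphatic or aromatic side chains, and the last column is invariant.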

The large difference in the pathologies induced by
coronavirus infections suggests that the role of these
two hydrophobic moieties of the S1 and S2 domains might
lie in the peplomer assembly rather than in the
interaction with the host cell. Residues F187, F334,
F253, and W423 could, indeed, form the S1 hydrophobic
pocket into which S2 inserts its hydrophobic finger,
formed by residues L803 and F805.

The remarkable agreement between the steric
requirements for the S1-S2 interaction and the surface
position of the proposed S1 binding site for the ACE2
receptor points towards a definite orientation of these
peplomeric domains, see Fig. 1. From this starting point,
to reassemble the full SARS-CoV peplomer from its
components, we used the following simple criteria:
(i) orienting each S1 and the S2 stalk domains so that
they could dock through the interaction described above,
(ii) keeping all the potential N-glycosylation sites as
surface exposed as possible, and (iii) positioning the
largest hydrophobic surface patches in subunit interfaces.
The fact that in the S2 trimer the side chains of the L803
and F805 residues are still surface accessible after
coiled-coil formation supports the hypothesis that the
peplomer reaches its structural stability through the
hydrophobic interactions of the S1 pocket with the S2
finger, as depicted in Fig. 2. Geometrical and energetic
considerations then converge towards possible solutions
for the structure of the SARS-CoV peplomer. In Fig. 3,
three molecules of S1-S2 adducts are positioned after
their assembly, in a way which is consistent with the
overall size of the peplomer [8]. It should also be noted
that, among the mutations found in the SARS-CoV spike
protein region of all the available genomes, see Table 2,
none occurs in the S1-S2 interfaces identified here.

Fig. 1. The S1 domain oriented to fit the morphological SEM images
of [8]. The potential N-glycosylation sites are colored in yellow and
the residues involved in the interaction with ACE2 [15] in red.
(For interpretation of the references to color in this figure
legend, the reader is referred to the web version of this paper.)

Fig. 2. A detailed representation of the S1 (dark blue) hydrophobic
pocket interacting with the S2 (pale blue) finger. (For interpretation of
the references to color in this figure legend, the reader is referred to the
web version of this paper.)

Fig. 3. Three spike glycoproteins form the peplomer of the SARS-CoV
coronavirus.
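
Criterion (ii) above relies on knowing where the potential N-glycosylation sites lie in the sequence. A minimal sketch of how such sites are commonly located is given below: it scans for the consensus sequon N-X-S/T, where X is any residue except proline. The function name and the toy sequence are hypothetical and serve only as an illustration; a sequon marks a potential site, not a confirmed glycosylation.

```python
import re

# Consensus sequon N-X-[S/T] with X != P; the lookahead allows overlapping hits.
SEQUON = re.compile(r"(?=N[^P][ST])")


def n_glycosylation_sites(sequence, first_residue=1):
    """Return residue numbers of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + first_residue for m in SEQUON.finditer(sequence.upper())]


# Hypothetical toy sequence, not taken from the spike protein.
toy_sequence = "MNGTNPSANVSNNTS"
print(n_glycosylation_sites(toy_sequence))
```

The `first_residue` argument lets a fragment be scanned while reporting positions in the numbering of the full-length protein.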

Table 2
Mutation site topology occurring in all the available strains of SARS-CoV spike glycoprotein

aa      Mutation topology          TOR2   BJ01   HZS2_C   CUHK_LC2   SOD   HZS2_FC   GZ_A   HGZ8L1_A
49      Exposed top                S      S      S        S          S     S         L      S
74      Exposed top                H      H      H        H          H     H         H      F
75      Exposed top                T      T      T        T          T     T         R      T
77      Exposed top                G      D      D        G          G     G         D      D
239     Partially exposed side     S      S      S        S          S     S         S      L
244     Buried                     I      T      T        I          I     I         T      T
311     Exposed side               G      G      G        G          G     G         G      R
344     Partially exposed side     K      K      K        K          K     K         K      R
577     Buried, interface S1-S1    A      S      S        S          S     S         S      S
778     Buried                     Y      Y      Y        Y          Y     Y         D      D
1148    Exposed down               L      L      L        L          F     F         L      L
1179    Buried                     L      L      L        L          L     L         L      F
1208    Non-modelled               A      A      A        V          A     A         A      A

The S2 moiety of the S glycoprotein is composed of one
well-structured subdomain containing HR1 and another
subdomain spanning the 1027-1195 segment of the S
glycoprotein and containing HR2 (residues 1148-1193).
Ambiguity remains about the structure of the latter and
its location in the peplomer. This subdomain is critical
both for the interaction with the viral envelope, owing to
its proximity to the trans-membrane region, and for the
overall structural stability of the peplomer. In fact, a
peptide reproducing the C-terminal heptad repeat fragment
1161-1187 of S2 exhibits antiviral activity [20].

Thus, 94% of the SARS-CoV peplomer structure has
been modelled and deposited with the PDB ID code
1T7G. Accordingly, the most interesting regions to be
reproduced in synthetic peptides for mimotope design
have been identified, as well as the hydrophobic sites
distributed at the S1/S1, S2/S2, and S1/S2 interfaces
for the targeting of potential antiviral drugs (patent
RM2004A000162). The fact that we could not model
the 665-736 sequence of the S glycoprotein does not
affect the exposed surface on top of the SARS-CoV
peplomer, as the missing moiety can be assigned to a
lateral region of the peplomer, where a deep groove is
found. This groove could be filled by the non-modelled
part of the sequence, which consistently exhibits an
extensive hydrophobic character [21]. In the present
peplomer model, the so-called HR1 and HR2 moieties, i.e.,
the N- and C-terminal heptad repeat regions, spanning the
sequences 904-975 and 1148-1193, respectively, are not
bound together. This feature is consistent with the
extensive conformational change that the fusogenic
mechanism requires in this and several other viruses
[22-24].

On the basis of the peplomer structure obtained,
correlations can then be explored between viral genome
mutations and possible interactions with the host cell
receptor(s). As reported in Table 2, a systematic
topological analysis of the mutation sites occurring in
the 36 genomes so far available for the SARS-CoV spike
glycoprotein has been carried out on the basis of the
proposed quaternary assembly. It can be observed that
(i) most of the mutations are found in exposed sites;
(ii) the only mutation involving a position relevant for
the receptor interaction, i.e., 344 K/R, is a conservative
one; and (iii) a non-conservative substitution, Y/D, is
found at the buried position 778, but it does not induce
steric conflicts.
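
Observation (i) can be verified directly from Table 2. The sketch below transcribes the table, with residues listed in the column order of the strains, and tallies the mutation sites by topology class. Grouping "partially exposed" sites together with the exposed ones is a simplifying assumption made here, not a classification stated in the text.

```python
from collections import Counter

# (position, topology label, residues in the strain column order of Table 2)
TABLE_2 = [
    (49, "Exposed top", "SSSSSSLS"),
    (74, "Exposed top", "HHHHHHHF"),
    (75, "Exposed top", "TTTTTTRT"),
    (77, "Exposed top", "GDDGGGDD"),
    (239, "Partially exposed side", "SSSSSSSL"),
    (244, "Buried", "ITTIIITT"),
    (311, "Exposed side", "GGGGGGGR"),
    (344, "Partially exposed side", "KKKKKKKR"),
    (577, "Buried, interface S1-S1", "ASSSSSSS"),
    (778, "Buried", "YYYYYYDD"),
    (1148, "Exposed down", "LLLLFFLL"),
    (1179, "Buried", "LLLLLLLF"),
    (1208, "Non-modelled", "AAAVAAAA"),
]


def topology_class(label):
    """Collapse the topology labels into exposed / buried / non-modelled."""
    lowered = label.lower()
    if "exposed" in lowered:  # assumption: partially exposed counts as exposed
        return "exposed"
    if "buried" in lowered:
        return "buried"
    return "non-modelled"


counts = Counter(topology_class(label) for _, label, _ in TABLE_2)
variants = {pos: "/".join(sorted(set(res))) for pos, _, res in TABLE_2}

print(dict(counts))
print(variants[344], variants[778])
```

With this grouping, 8 of the 13 tabulated sites are exposed, 4 are buried, and 1 lies in the non-modelled region.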

The structural characterisation of the SARS-CoV spike
glycoprotein domains described here also suggests a
general scheme for the peplomer assembly of all
coronaviruses. In fact, from the sequence alignment of the
spike glycoproteins of all known coronaviruses, as
shown in Table 1, it appears that the above-described
hydrophobic interaction between the S1 pocket and
the S2 finger is highly conserved. It could, therefore,
represent a critical region for the interaction between
the S1 and S2 domains in all coronaviruses, opening new
perspectives for the design of small molecules that can
efficiently interfere with viral replication. Hence,
SARS-CoV, as well as all the other members of the
Coronaviridae family, could be made to put down its
infecting crown by the same type of antiviral drug, which
could also protect against possible transmission of
coronavirus infections from wild animals.